Drug related deaths in the United States of America

A case study

Introduction

Over the last couple of years, we have often read about drug overdoses being a big problem in the US. And of course we have all seen Netflix's Narcos. This made us want to dig deeper into the drug problem in the US. Robert R. Redfield, the director of the Centers for Disease Control and Prevention, recently said: “These statistics are a clear warning that we are losing too many Americans, too early and too often, to causes we can prevent.” He was referring to recent studies of overdose and suicide statistics in the US. In short, we want to have a look at how bad things are in the land of dreams.

More specifically, we want to see which states have the most and the fewest overdoses (ODs), scaled for the population of each state. We also look at the types of drugs involved in ODs and check whether ODs correlate with income, unemployment, weather and population. It would also be interesting to see whether ODs occur more or less often in particular months of the year. At the end we use our data to create a regression model that tries to explain ODs, but don’t get your hopes up for that one. As we read more and ran more tests, we quickly found out that it is not easy to explain why people end up overdosing on drugs, since life is complicated and humans are not rational. We will try to illustrate our findings as clearly as we can. As mentioned, several things motivated this project. Here is an example of a relevant article: https://www.nrk.no/urix/forventet-levealder-i-usa-faller-1.14324124 . We hope you enjoy our work. All the code can be found on our GitHub: https://github.com/omyrland/Data_Science_exam.git

The data

The data in the project is the result of scraping, gathering, massaging and arranging multiple data sets with tens of thousands of observations. All the data can be found in the Rmd document on our GitHub. Feel free to download the file, take a look at the data and do your own calculations. Remember to download the CSS file as well to be able to compile it as HTML in R Markdown.

This project contains multiple sources of data and datasets:

-VSRR Provisional Drug Overdose Death Counts

The main source of data is contained in this dataset. It contains provisional counts for drug overdose deaths based on a current flow of mortality data in the National Vital Statistics System. National provisional counts include deaths occurring within the 50 states and the District of Columbia as of the date specified and may not include all deaths that occurred during a given time period. Provisional counts are often incomplete and causes of death may be pending investigation resulting in an underestimate relative to final counts.

Other smaller datasets:

-State population 2015 - Infoplease

-State population 2016 - Dilemma X

-State population 2017 - Wikipedia

-State population 2018 - World Population Review

-Local Area Unemployment Statistics

-Average Annual Sunshine by State

-Average Annual Temperature for Each US State

-Average Annual Precipitation by State

-Median Household income

-List of Latitudes and Longitudes for every State

Interactive map with summary of every state

Click the popup icon to read about the statistics for each state.

Let’s take a look at the total number of deaths by drugs per 10 000 inhabitants in each state

We scale the number of deaths by each state's population so that states of different sizes can be compared directly. Then we divide the states into four regions:

Northeast

Midwest

South

West
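The scaling and the region split described above could be sketched roughly as follows. This is only an illustration: the data frame `deaths`, its column names and the three rows of numbers are made up, and the region lookup relies on R's built-in `state.name` and `state.region` vectors, which do not include the District of Columbia (it would need to be assigned by hand).

```r
# Hypothetical input: one row per state with a death count and a population.
# The numbers are invented for illustration and are NOT taken from the study.
deaths <- data.frame(
  state      = c("Ohio", "Texas", "Maine"),
  od_deaths  = c(5000, 3000, 400),
  population = c(11600000, 28000000, 1330000)
)

# Deaths per 10 000 inhabitants: divide by population, then scale.
deaths$rate_per_10k <- deaths$od_deaths / deaths$population * 10000

# Region lookup from R's built-in state vectors. Older labels call the
# Midwest "North Central", so that label is renamed if it is present.
region_lookup <- setNames(as.character(state.region), state.name)
region_lookup[region_lookup == "North Central"] <- "Midwest"
deaths$region <- unname(region_lookup[deaths$state])

# Rank 1 = highest death rate.
deaths$rank <- rank(-deaths$rate_per_10k, ties.method = "min")

print(deaths[order(deaths$rank), ])
```

Any state missing from the lookup (such as DC) ends up with `NA` as its region, which makes gaps easy to spot before plotting.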

Deaths by drugs for all states compared to one another in 2016:

How did they rank in 2017?

State Deaths per 10 000 inhabitants Rank
District of Columbia 68 1
West Virginia 64 2
Ohio 52 3
Pennsylvania 50 4
Maryland 46 5
Kentucky 43 6
Delaware 42 7
New Hampshire 41 8
Massachusetts 38 9
Rhode Island 38 10
Connecticut 35 11
Maine 34 12
Florida 34 13
Tennessee 32 14
Indiana 32 15
New Jersey 32 16
Michigan 32 17
Nevada 29 18
New Mexico 29 19
Missouri 28 20
North Carolina 27 21
Louisiana 27 22
Vermont 26 23
Oklahoma 26 24
Arizona 25 25
South Carolina 24 26
Utah 24 27
Illinois 24 28
Wisconsin 24 29
Colorado 22 30
Alaska 21 31
Virginia 21 32
Alabama 20 33
Washington 18 34
Georgia 18 35
Idaho 16 36
Hawaii 16 37
Arkansas 16 38
Minnesota 15 39
Wyoming 15 40
Oregon 15 41
California 15 42
New York 15 43
Montana 14 44
North Dakota 13 45
Mississippi 13 46
Kansas 13 47
Texas 13 48
Iowa 13 49
South Dakota 10 50
Nebraska 8 51

You probably don’t wanna take your kids on vacation to DC, West Virginia or Ohio

These are serious numbers. The average number of drug deaths per state was 11 585 in 2015 and, worse, it rose to 15 856 in 2017. This is an increase of 36.9% from 2015 to 2017, which indicates a serious problem in the US. The total number of deaths by drugs was 590 825 in 2015, 682 084 in 2016 and 808 661 in 2017. That adds up to 2 081 570 people over just the three years covered by this case study. To put things in perspective, that is about the same as the total population of Slovenia. Wiped out over three years.
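The arithmetic behind these figures can be checked with a few lines of R. The yearly totals are the ones quoted above; the divisor of 51 (50 states plus the District of Columbia) is our assumption about how the per-state average was formed, based on the 51 rows in the ranking table.

```r
# Yearly totals as quoted in the text.
totals <- c("2015" = 590825, "2016" = 682084, "2017" = 808661)

# Assumption: the average is taken over 50 states plus DC.
n_states <- 51
avg_per_state <- totals / n_states

# Percentage increase in the per-state average from 2015 to 2017.
pct_increase <- (avg_per_state[["2017"]] / avg_per_state[["2015"]] - 1) * 100

# Sum over the three years.
total_three_years <- sum(totals)

print(round(avg_per_state))      # per-state averages
print(round(pct_increase, 1))    # increase in percent
print(total_three_years)         # three-year total
```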

How many drug deaths do we find compared to all deaths?

On average, 1.9% of all deaths in the US were drug related in 2015

On average, 2.2% of all deaths in the US were drug related in 2016

On average, 2.45% of all deaths in the US were drug related in 2017

Will low income and high unemployment result in high overdose rate?

2016 numbers

Our assumption is that low income and high unemployment lead to a high OD rate. So we take a look at the top 10 states for highest overdose rate, lowest income and highest unemployment.

Scatter plot for median income

State Deaths per 10 000 inhabitants Rank
West Virginia 53 1
New Hampshire 39 2
Ohio 39 3
District of Columbia 38 4
Rhode Island 38 5
Kentucky 36 6
Pennsylvania 36 7
Massachusetts 35 8
Maryland 34 9
Connecticut 30 10

State Median household income 2016 (USD) Rank
Mississippi 40528 51
Arkansas 42336 50
West Virginia 42644 49
Alabama 44758 48
Kentucky 44811 47
Louisiana 45652 46
New Mexico 45674 45
Tennessee 46574 44
South Carolina 46898 43
Oklahoma 48038 42

State Unemployment rate 2016 (% of the state's labor force) Rank
Alaska 6.9 51
New Mexico 6.7 50
District of Columbia 6.1 48
West Virginia 6.1 48
Louisiana 6.0 47
Alabama 5.9 46
Illinois 5.8 44
Mississippi 5.8 44
Nevada 5.7 43
California 5.5 42

As we can see, this is not always the case. West Virginia stands out and ranks badly in all three categories. The people working in DC make good money, but both unemployment and the OD rate are high. Kentucky appears in both the overdose and the low-income tables. Maryland has had the highest income of all states over the last couple of years by a solid margin, and is the 9th worst place for ODs in the country. Otherwise the similarities were not as strong as expected.

The correlation coefficient between overdoses and unemployment was 0.2497 (24.97%) in 2016.

The correlation coefficient between overdoses and income was 0.0115 (1.15%) in 2016.
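For readers who want to see how such a correlation is obtained, here is a minimal sketch using base R's `cor()`. It only uses the eight states that happen to appear in both the overdose tables and the unemployment tables of this report, so it is a small, non-random subset and will not reproduce the figure for all states. The data frame and its column names are our own.

```r
# States that appear in both an overdose table and an unemployment table
# in this report (2016 values). This is a subset chosen for illustration.
subset16 <- data.frame(
  state        = c("West Virginia", "New Hampshire", "District of Columbia",
                   "Mississippi", "Nebraska", "South Dakota",
                   "North Dakota", "Iowa"),
  od_rate      = c(53, 39, 38, 13.2, 7.5, 9.3, 10.9, 11.9),
  unemployment = c(6.1, 2.9, 6.1, 5.8, 3.1, 3.0, 3.1, 3.6)
)

# Pearson correlation with the built-in function.
r_builtin <- cor(subset16$od_rate, subset16$unemployment)

# The same coefficient written out: covariance divided by the product
# of the standard deviations (the n - 1 factors cancel).
dx <- subset16$od_rate - mean(subset16$od_rate)
dy <- subset16$unemployment - mean(subset16$unemployment)
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

print(r_builtin)
```

A coefficient near 0 means no linear relationship, while values near 1 or -1 mean a strong positive or negative one.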

But high income and low unemployment would equal a low OD rate, right?

State Deaths per 10 000 inhabitants Rank
Nebraska 7.5 1
South Dakota 9.3 2
North Dakota 10.9 3
Texas 11.7 4
Iowa 11.9 5
New York 12.6 6
Kansas 13.0 7
Mississippi 13.2 8
Minnesota 13.9 9
Montana 14.2 10

State Median household income 2016 (USD) Rank
Maryland 76067 1
Alaska 74444 2
New Jersey 73702 3
District of Columbia 72935 4
Hawaii 71977 5
Connecticut 71755 6
Massachusetts 70954 7
New Hampshire 68485 8
Virginia 66149 9
California 63783 10

State Unemployment rate 2016 (% of the state's labor force) Rank
New Hampshire 2.9 1
Hawaii 2.9 1
South Dakota 3.0 3
North Dakota 3.1 4
Nebraska 3.1 4
Vermont 3.2 6
Colorado 3.3 7
Utah 3.4 8
Iowa 3.6 9
Maine 3.8 10

Not for Maryland, that's for sure.

What about the states with high (good) and low (bad) temperatures?

State Average temperature (°C) Rank (warmest first)
Florida 22 1
Hawaii 21 2
Louisiana 19 3
Texas 18 4
Georgia 18 5
Mississippi 17 6
Alabama 17 7
South Carolina 17 8
Arkansas 16 9
Arizona 16 10

State Average temperature (°C) Rank (coldest first)
Alaska -3.0 1
North Dakota 4.7 2
Maine 5.0 3
Minnesota 5.1 4
Wyoming 5.6 5
Montana 5.9 6
Vermont 6.1 7
Wisconsin 6.2 8
New Hampshire 6.6 9
Michigan 6.9 10

The correlation coefficient between overdoses and temperature was 0.3198 (31.98%) in 2016.

The correlation coefficient between overdoses and population was 0.868 (86.8%) in 2016.

Average number of drug overdoses per month 2015-2018 per state

Our observation is that the average number of overdose deaths appears to increase roughly linearly from month to month and from year to year between 2015 and 2018. The decrease in 2018 is probably due to incomplete data.

Average number of deaths by drugs vs average number of total deaths 2015-2018

From the graphs and the correlation test (cor = 0.94) we can see a strong linear relationship between the mean number of overdoses and the mean number of deaths. However, the p-value is above 0.05, so the relationship is not statistically significant at the 5% level. With so few data points, even a strong correlation can fail to reach significance.
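To see how a correlation of 0.94 can still come with a p-value above 0.05, here is a small sketch of the t-test that sits behind a Pearson correlation test. The sample size `n = 4` is purely our assumption (one mean per year, 2015 to 2018); the report does not state how many points went into the test.

```r
# Correlation reported in the text.
r <- 0.94

# ASSUMPTION: four data points, one yearly mean for each of 2015-2018.
n <- 4
df <- n - 2

# Test statistic for H0: true correlation is zero.
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_stat), df = df)

print(t_stat)
print(p_value)
```

Under this assumption the p-value lands just above 0.05, which would be consistent with the result described above. With five or more points, the same correlation would be significant at the 5% level.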

Remember that the number of incidents does not equal the number of deaths, since a person could be affected by several drugs and each of them would be registered.

What will a linear regression model say about overdoses dependent on income, weather and unemployment?

## 
## Call:
## lm(formula = OD16_relation ~ Median_income16 + PrecipitationMM + 
##     Clear_days + Rate2016, data = comparedata)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.001323 -0.000648 -0.000169  0.000505  0.002511 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)  
## (Intercept)      3.35e-04   1.45e-03    0.23    0.818  
## Median_income16  1.02e-08   1.48e-08    0.69    0.496  
## PrecipitationMM  4.60e-07   3.84e-07    1.20    0.238  
## Clear_days      -3.18e-06   5.20e-06   -0.61    0.544  
## Rate2016         2.68e-04   1.34e-04    2.01    0.051 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9e-04 on 45 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.145,  Adjusted R-squared:  0.0687 
## F-statistic:  1.9 on 4 and 45 DF,  p-value: 0.126
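A model of this shape could be fitted roughly as in the sketch below. The formula and the column names are taken from the `Call:` shown in the output, but the data frame here is filled with simulated numbers (an assumption for illustration only), so the estimates will not match the ones above.

```r
set.seed(1)
n <- 51  # assumption: 50 states plus DC

# SIMULATED stand-in for the real `comparedata`; values are not real.
comparedata <- data.frame(
  Median_income16 = runif(n, 40000, 76000),
  PrecipitationMM = runif(n, 200, 1600),
  Clear_days      = runif(n, 60, 200),
  Rate2016        = runif(n, 2.9, 6.9)
)
comparedata$OD16_relation <-
  0.001 + 0.0003 * comparedata$Rate2016 + rnorm(n, sd = 0.0009)

# One missing value, to mirror "1 observation deleted due to missingness".
comparedata$Clear_days[1] <- NA

# Same formula as in the output above; lm() drops incomplete rows by default.
fit <- lm(OD16_relation ~ Median_income16 + PrecipitationMM +
            Clear_days + Rate2016, data = comparedata)

fit_summary <- summary(fit)
print(fit_summary)

# 50 complete rows minus 5 estimated coefficients.
print(df.residual(fit))
```

The residual degrees of freedom come out as 45, the same as in the output above, because one of the 51 rows is dropped and five coefficients are estimated.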

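As we can see, none of the explanatory variables is significant at the 5% level. The unemployment rate (Rate2016) comes closest with a p-value of 0.051. The model explains only about 14.5% of the variation in the overdose rate (adjusted R-squared 0.0687), and the overall F-test (p-value 0.126) is not significant either.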

Heatmap of drug ratio US

As the heatmap shows, most of the overdoses are in the eastern US. This is most likely because there are more states and more people in this part of the country than in the rest of the country.